# Upstash Ratelimit Callback

In this guide, we will go over how to add rate limiting based on number of requests or the number of tokens using `UpstashRatelimitHandler`. This handler uses [Upstash's ratelimit library](https://github.com/upstash/ratelimit-js/), which utilizes [Upstash Redis](https://upstash.com/docs/redis/overall/getstarted).

Upstash Ratelimit works by sending an HTTP request to Upstash Redis every time the `limit` method is called. Remaining tokens/requests of the user are checked and updated. Based on the remaining tokens, we can stop the execution of costly operations, like invoking an LLM or querying a vector store:

```tsx
const response = await ratelimit.limit();
if (response.success) {
  execute_costly_operation();
}
```

`UpstashRatelimitHandler` allows you to incorporate this ratelimit logic into your chain in a few minutes.

## Setup

First, you will need to go to [the Upstash Console](https://console.upstash.com/login) and create a redis database ([see our docs](https://upstash.com/docs/redis/overall/getstarted)). After creating a database, you will need to set the environment variables:

```
UPSTASH_REDIS_REST_URL="****"
UPSTASH_REDIS_REST_TOKEN="****"
```

Next, you will need to install Upstash Ratelimit and `@langchain/community`:

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install @upstash/ratelimit @langchain/community @langchain/core
```

You are now ready to add rate limiting to your chain!

## Ratelimiting Per Request

Let's imagine that we want to allow our users to invoke our chain 10 times per minute. Achieving this is as simple as:

```tsx
const UPSTASH_REDIS_REST_URL = "****";
const UPSTASH_REDIS_REST_TOKEN = "****";

import {
  UpstashRatelimitHandler,
  UpstashRatelimitError,
} from "@langchain/community/callbacks/handlers/upstash_ratelimit";
import { RunnableLambda } from "@langchain/core/runnables";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// create ratelimit
const ratelimit = new Ratelimit({
  redis: new Redis({
    url: UPSTASH_REDIS_REST_URL,
    token: UPSTASH_REDIS_REST_TOKEN,
  }),
  // 10 requests per window, where window size is 60 seconds:
  limiter: Ratelimit.fixedWindow(10, "60 s"),
});

// create handler
const user_id = "user_id"; // should be a method which gets the user id
const handler = new UpstashRatelimitHandler(user_id, {
  requestRatelimit: ratelimit,
});

// create mock chain
const chain = new RunnableLambda({ func: (str: string): string => str });

try {
  const response = await chain.invoke("hello world", {
    callbacks: [handler],
  });
  console.log(response);
} catch (err) {
  if (err instanceof UpstashRatelimitError) {
    console.log("Handling ratelimit.");
  }
}
```

Note that we pass the handler to the `invoke` method instead of passing the handler when defining the chain.

For rate limiting algorithms other than `FixedWindow`, see [upstash-ratelimit docs](https://upstash.com/docs/oss/sdks/ts/ratelimit/algorithms).

Before executing any steps in our pipeline, ratelimit will check whether the user has passed the request limit. If so, `UpstashRatelimitError` is raised.

## Ratelimiting Per Token

Another option is to rate limit chain invokations based on:

1. number of tokens in prompt
2. number of tokens in prompt and LLM completion

This only works if you have an LLM in your chain. Another requirement is that the LLM you are using should return the token usage in it's `LLMOutput`. The format of the token usage dictionary returned depends on the LLM. To learn about how you should configure the handler depending on your LLM, see the end of the Configuration section below.

### How it works

The handler will get the remaining tokens before calling the LLM. If the remaining tokens is more than 0, LLM will be called. Otherwise `UpstashRatelimitError` will be raised.

After LLM is called, token usage information will be used to subtracted from the remaining tokens of the user. No error is raised at this stage of the chain.

### Configuration

For the first configuration, simply initialize the handler like this:

```tsx
const user_id = "user_id"; // should be a method which gets the user id
const handler = new UpstashRatelimitHandler(user_id, {
  requestRatelimit: ratelimit,
});
```

For the second configuration, here is how to initialize the handler:

```tsx
const user_id = "user_id"; // should be a method which gets the user id
const handler = new UpstashRatelimitHandler(user_id, {
  tokenRatelimit: ratelimit,
});
```

You can also employ ratelimiting based on requests and tokens at the same time, simply by passing both `request_ratelimit` and `token_ratelimit` parameters.

For token usage to work correctly, the LLM step in LangChain.js should return a token usage field in the following format:

```json
{
  "tokenUsage": {
    "totalTokens": 123,
    "promptTokens": 456,
    "otherFields: "..."
  },
  "otherFields: "..."
}
```

Not all LLMs in LangChain.js comply with this format however. If your LLM returns the same values with different keys, you can use the parameters `llmOutputTokenUsageField`, `llmOutputTotalTokenField` and `llmOutputPromptTokenField` by passing them to the handler:

```tsx
const handler = new UpstashRatelimitHandler(
  user_id,
  {
    requestRatelimit: ratelimit
    llmOutputTokenUsageField: "usage",
    llmOutputTotalTokenField: "total",
    llmOutputPromptTokenField: "prompt"
  }
)
```

Here is an example with a chain utilizing an LLM:

```tsx
const UPSTASH_REDIS_REST_URL = "****";
const UPSTASH_REDIS_REST_TOKEN = "****";
const OPENAI_API_KEY = "****";

import {
  UpstashRatelimitHandler,
  UpstashRatelimitError,
} from "@langchain/community/callbacks/handlers/upstash_ratelimit";
import { RunnableLambda, RunnableSequence } from "@langchain/core/runnables";
import { OpenAI } from "@langchain/openai";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// create ratelimit
const ratelimit = new Ratelimit({
  redis: new Redis({
    url: UPSTASH_REDIS_REST_URL,
    token: UPSTASH_REDIS_REST_TOKEN,
  }),
  // 500 tokens per window, where window size is 60 seconds:
  limiter: Ratelimit.fixedWindow(500, "60 s"),
});

// create handler
const user_id = "user_id"; // should be a method which gets the user id
const handler = new UpstashRatelimitHandler(user_id, {
  tokenRatelimit: ratelimit,
});

// create mock chain
const asStr = new RunnableLambda({ func: (str: string): string => str });
const model = new OpenAI({
  apiKey: OPENAI_API_KEY,
});
const chain = RunnableSequence.from([asStr, model]);

// invoke chain with handler:
try {
  const response = await chain.invoke("hello world", {
    callbacks: [handler],
  });
  console.log(response);
} catch (err) {
  if (err instanceof UpstashRatelimitError) {
    console.log("Handling ratelimit.");
  }
}
```
